Short vector code generation and adaptation for DSP algorithms
Abstract
Most recent general-purpose processors feature short vector SIMD instructions, such as SSE on the Pentium III and 4. In this paper we use SPIRAL to automatically generate platform-adapted short vector code for DSP transform algorithms. SPIRAL represents and generates fast algorithms as mathematical formulas and translates them into code. Adaptation is achieved by searching the space of algorithmic and coding alternatives for the fastest implementation on the given platform. We explain the mathematical foundation that relates formula constructs to vector code and give an overview of the vector code generator within SPIRAL. Experimental results show excellent speed-ups compared to ordinary C code for a variety of transforms and computing platforms. For the DFT on Pentium 4, our automatically generated code compares favorably with the hand-tuned Intel MKL vendor library.
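The abstract does not say which formula constructs map to vector code. As an illustrative assumption, the sketch below uses the tensor product A ⊗ I_v, where v is the vector length: every scalar operation in A becomes the same operation on whole v-element vectors. The example takes A = DFT_2 and v = 4 (four single-precision floats, as in an SSE register) and checks the vectorized butterfly against an explicit matrix-vector product. All function names are hypothetical, and this is not SPIRAL's code generator.

```python
# Sketch only: why a formula of the form (A ⊗ I_v) suits v-way vector code.
# Assumption: v = 4, mimicking four floats per SSE register. Names are hypothetical.

V = 4  # assumed vector length


def kron_with_identity(a, v):
    """Build the dense matrix A ⊗ I_v explicitly (reference only)."""
    n = len(a)
    m = [[0.0] * (n * v) for _ in range(n * v)]
    for i in range(n):
        for j in range(n):
            for k in range(v):
                m[i * v + k][j * v + k] = a[i][j]
    return m


def matvec(m, x):
    """Plain scalar matrix-vector product."""
    return [sum(mij * xj for mij, xj in zip(row, x)) for row in m]


def vadd(p, q):
    """Stand-in for one v-way vector addition."""
    return [a + b for a, b in zip(p, q)]


def vsub(p, q):
    """Stand-in for one v-way vector subtraction."""
    return [a - b for a, b in zip(p, q)]


def butterfly_vectorized(x, v=V):
    """Compute (DFT_2 ⊗ I_v) x using only whole-vector add and subtract."""
    lo, hi = x[:v], x[v:2 * v]
    return vadd(lo, hi) + vsub(lo, hi)


DFT2 = [[1.0, 1.0], [1.0, -1.0]]
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]

reference = matvec(kron_with_identity(DFT2, V), x)
fast = butterfly_vectorized(x)
print(fast == reference)
```

Sixteen scalar multiply-adds per output half in the reference collapse to one vector add and one vector subtract, with no data shuffling inside a vector. Constructs that do need such shuffling are harder to map to SIMD instructions.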
Similar resources
Short Vector SIMD Code Generation for DSP Algorithms
Short vector SIMD instructions on recent general purpose microprocessors, such as SSE on Pentium III and 4, offer a high potential speed-up but require a very high level of programming expertise. We present a compiler that generates vectorized code for digital signal processing algorithms such as the fast Fourier transform (FFT). The input to our compiler is a mathematical description of the al...
Automatic Generation of SIMD DSP Code
Short vector SIMD instructions on recent microprocessors, such as SSE on Pentium III and 4, speed up code but are a major challenge to software developers. This report introduces a compiler that automatically generates C code enhanced with short vector instructions for digital signal processing (DSP) transforms, such as the fast Fourier transform (FFT). The input to the compiler is a concise ma...
A SIMD Vectorizing Compiler for Digital Signal Processing Algorithms
Short vector SIMD instructions on recent microprocessors, such as SSE on Pentium III and 4, speed up code but are a major challenge to software developers. We present a compiler that automatically generates C code enhanced with short vector instructions for digital signal processing (DSP) transforms, such as the fast Fourier transform (FFT). The input to our compiler is a concise mathematical d...
Efficient Implementation Methodology of Fast FIR Filtering Algorithms on DSP
A class of Finite Impulse Response (FIR) filtering algorithms based either on short Fast Fourier Transforms (FFT) or on short length FIR filtering algorithms was recently proposed. Besides the significant reduction of the arithmetic complexity, these algorithms present some characteristics which make them useful in many applications, namely a small delay processing (independent on the FIR filte...
Automatic Implementation and Platform Adaptation of Discrete Filtering and Wavelet Algorithms
Aca Gačić. José M. F. Moura, Markus Püschel. Carnegie Mellon University, 2004. Moore’s law, with the doubling of the transistor count every 18 months, poses serious challenges to high-performance numerical software designers: how to stay close to the maximum achievable performance on ever-changing and ever...